Diametrical Risk Minimization: theory and computations


Abstract

The theoretical and empirical performance of Empirical Risk Minimization (ERM) often suffers when loss functions are poorly behaved, with large Lipschitz moduli and spurious sharp minimizers. We propose and analyze a counterpart to ERM called Diametrical Risk Minimization (DRM), which accounts for worst-case empirical risks within neighborhoods in parameter space. DRM has generalization bounds that are independent of Lipschitz moduli for convex as well as nonconvex problems, and it can be implemented using a practical algorithm based on stochastic gradient descent. Numerical results illustrate the ability of DRM to find quality solutions with low generalization error in sharp empirical risk landscapes from benchmark neural network classification problems with corrupted labels.
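The idea of a worst-case risk over a neighborhood in parameter space can be illustrated with a short sketch. This is not the authors' algorithm: it assumes a Euclidean ball of hypothetical radius `radius`, a squared loss on a linear model as a stand-in for any loss, and a crude sampled approximation of the inner maximum. All function names and default values below are illustrative assumptions.

```python
import numpy as np

def empirical_risk(w, X, y):
    """Mean squared error of a linear model; a stand-in for any loss."""
    residual = X @ w - y
    return float(np.mean(residual ** 2))

def risk_gradient(w, X, y):
    """Gradient of the mean squared error with respect to w."""
    residual = X @ w - y
    return 2.0 * X.T @ residual / len(y)

def sample_in_ball(rng, dim, radius, count):
    """Draw `count` points uniformly from the Euclidean ball of given radius."""
    directions = rng.normal(size=(count, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(count, 1)) ** (1.0 / dim)
    return directions * radii

def approx_diametrical_risk(w, X, y, radius, rng, count=20):
    """Sampled estimate (from below) of max over ||v|| <= radius of the risk at w + v."""
    offsets = sample_in_ball(rng, w.size, radius, count)
    offsets = np.vstack([np.zeros_like(w), offsets])  # include w itself
    risks = [empirical_risk(w + v, X, y) for v in offsets]
    worst = int(np.argmax(risks))
    return risks[worst], offsets[worst]

def drm_sketch(X, y, radius=0.1, step=0.05, iters=200, batch=16, seed=0):
    """Mini-batch descent that steps along the gradient at the worst sampled point."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        idx = rng.choice(len(y), size=min(batch, len(y)), replace=False)
        _, v = approx_diametrical_risk(w, X[idx], y[idx], radius, rng)
        w = w - step * risk_gradient(w + v, X[idx], y[idx])
    return w
```

Because the sampled set always contains the zero offset, the estimate is never below the plain empirical risk at `w`. Setting `radius=0` reduces the loop to ordinary mini-batch stochastic gradient descent on the empirical risk.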


Similar resources

Structured Prediction Theory and Voted Risk Minimization

We present a general theoretical analysis of structured prediction with a series of new results. We give new data-dependent margin guarantees for structured prediction for a very wide family of loss functions and a general family of hypotheses, with an arbitrary factor graph decomposition. These are the tightest margin bounds known for both standard multi-class and general structured prediction...


Learning Theory for Conditional Risk Minimization

In this work we study the learnability of stochastic processes with respect to the conditional risk, i.e. the existence of a learning algorithm that improves its next-step performance with the amount of observed data. We introduce a notion of pairwise discrepancy between conditional distributions at different times steps and show how certain properties of these discrepancies can be used to cons...


Principles of Risk Minimization for Learning Theory

Learning is posed as a problem of function estimation, for which two principles of solution are considered: empirical risk minimization and structural risk minimization. These two principles are applied to two different statements of the function estimation problem: global and local. Systematic improvements in prediction power are illustrated in application to zip-code recognition.


Selecting Computations: Theory and Applications

Sequential decision problems are often approximately solvable by simulating possible future action sequences. Metalevel decision procedures have been developed for selecting which action sequences to simulate, based on estimating the expected improvement in decision quality that would result from any particular simulation; an example is the recent work on using bandit algorithms to control Mont...


Comparison of entropy generation minimization principle and entransy theory in optimal design of thermal systems

In this study, the relationship among the concepts of entropy generation rate, entransy theory, and generalized thermal resistance to the optimal design of thermal systems is discussed. The equations of entropy and entransy rates are compared and their implications for optimization of conductive heat transfer are analyzed. The theoretical analyses show that based on entropy generation minimizat...



Journal

Journal title: Machine Learning

Year: 2021

ISSN: 0885-6125, 1573-0565

DOI: https://doi.org/10.1007/s10994-021-06036-0